Internet Info 1997 December

Internet Message Format | 1997-02-19 | 8KB

Received: (from daemon@localhost) by services.bunyip.com (8.6.10/8.6.9)
        id PAA29056 for urn-ietf-out; Tue, 13 Aug 1996 15:55:41 -0400
Received: from mocha.bunyip.com (mocha.Bunyip.Com [192.197.208.1])
        by services.bunyip.com (8.6.10/8.6.9) with SMTP id PAA29051
        for <urn-ietf@services.bunyip.com>; Tue, 13 Aug 1996 15:55:36 -0400
Received: from mintaka.lcs.mit.edu by mocha.bunyip.com with SMTP
        (5.65a/IDA-1.4.2b/CC-Guru-2b) id AA16249
        (mail destined for urn-ietf@services.bunyip.com);
        Tue, 13 Aug 96 15:55:05 -0400
Received: from skadhwe.lcs.mit.edu by MINTAKA.LCS.MIT.EDU id aa28941;
        13 Aug 96 15:54 EDT
Received: by skadhwe.lcs.mit.edu; (5.65/1.1.8.2/15Aug95-0306PM)
        id AA03481; Tue, 13 Aug 1996 15:54:34 -0400
Date: Tue, 13 Aug 1996 15:54:34 -0400
Message-Id: <9608131954.AA03481@skadhwe.lcs.mit.edu>
From: Lewis Girod <girod@LCS.MIT.EDU>
To: jon@net.lut.ac.uk
Cc: rdaniel@acl.lanl.gov, urn-ietf@bunyip.com
In-Reply-To: <Pine.SUN.3.91.960813143052.8034o-100000@weeble.lut.ac.uk>
        (message from Jon Knight on Tue, 13 Aug 1996 15:02:24 +0100 (BST))
Subject: Re: [URN] nasty rewriting rules
Sender: owner-urn-ietf@services.bunyip.com
Precedence: bulk
Reply-To: Lewis Girod <girod@LCS.MIT.EDU>
Errors-To: owner-urn-ietf@bunyip.com

On Tue, 13 Aug 1996 15:02:24 +0100 (BST) Jon Knight wrote:

   Regexp interpreters are already available in source code format
   that you can just plug and play on many current platforms and
   which will ease porting to new architectures.

Fair enough, although it is important to have agreement as to what
flavor of regexps is being used. The original plan was to just
execute sed, which solves both issues on UNIX platforms, but requires
a little more work otherwise.

   > There are two questions here. First, this statement is only true
   > given that the structure of the name scheme actually remains
   > consistent with this set structure in the context of NAPTR (this gets
   > back to point (1) above).
   > If they were resolved with NAPTR using
   > regexps little would prevent someone from changing the format of an
   > ISBN in various ways; there is no technical impediment to this, and if
   > the top level ISBN naming authority doesn't want it to happen their
   > only recourse is legal.

   Surely though as long as the end users are using "legal" ISBNs in
   their URNs, the rewriting that happens is just the business of the
   people "owning" that ISBN and their resolution agents (might be one
   and the same thing in some cases). If the resolution agents at the
   top level of a naming authority reject "illegal" constructs in that
   scheme that they might get given, people will soon get the idea and
   stop using them.

Oops, thanks for catching that bug. Yes, it is sufficient to reject
malformed URNs at the top level, and as you say that fixes the
problem. As a rule, this can be done using regexps (although I have
heard of no plans to do so). The fact remains that unless such
syntax-checking regexps are actually implemented at the top level,
there is no guarantee that a namespace will retain a set structure.

   If the top level naming authority never sees the "illegal"
   rewritten versions then that's fine as it means that the rewriting
   is happening in private lower down the resolution tree.

Remember, the rewrite rules are rewriting URNs into domain names, not
into other URNs -- so a URN is either legal or illegal from the start.
The concern is that names could be added to the namespace which don't
conform to the name scheme as defined, but which would still be
resolved by the system. If unpleasant contortions are required to
perform the resolution downstream it may make the name scheme harder
to migrate as a whole. The question is whether the flexibility that
makes this possible is necessary.

   I'd say that "enforcement" is just a private thing within a
   particular naming scheme resolution tree and isn't something we
   should be dipping our toes into (not if we want URNs in this
   lifetime anyway).

There is some suggestion in the framework document that there be a set
of requirements of new name schemes; any ``enforcement'' is limited to
that sort of thing. We have not yet figured out what requirements are
to be applied, but clearly names will need to be verified somewhere to
ensure that they are satisfying them -- otherwise there might as well
be no requirements. So the question becomes, what requirements do we
want to set? One possible requirement for NAPTR classic might be that
the syntax be defined and checked. What I am suggesting boils down to
specifying the hierarchy scheme and ensuring that the NAPTR system
escape names to alternate systems as soon as the hierarchy spec is
invalidated.

   > This should make it easier for
   > people to learn and program in the ``language'' (if such a simple
   > thing can be called a language!!) while at the same time making syntax
   > errors less likely. For example, we could use S-expressions (pardon
   > my rough adherence to standard BNF form..):

   I can feel an Emacs mode coming on... :-) :-)

But of course... :-)

   > So for example, given a canonical form email address
   > mail:edu.mit.lcs@girod
   >
   > ((eat-including ":" &replace-with "")
   >  (eat-until "@" &copy)
   >  (eat-including "@" &replace-with ".mail-urn"))
   >
   > Would ``compile'' to ``x":"+p"";x"@"-v;x"@"+p".mail-urn";'', and would
   > generate the pair ("mail-urn.lcs.mit.edu", "girod"). Note that the
   > translation took care of removing the ``urn:'' if it was there. This
   > or something like it should be simple enough to learn.

   And this is simpler than regexps? As a sysadmin I'd much rather use
   something I'm already used to (ie: regexps) rather than learning
   (and making mistakes in) another new syntax. Of course those of us
   that do want regexp are still free to have a "NAPTRclassic" (:-) )
   namespace hived off using SRV records that does do regexp
   replacement, so maybe we can all have our cake and eat it?

I am the first to say that the terse syntax is wretched, but it was
really easy to implement and it is terse enough to easily fit into a
DNS response. Humans are supposed to use the simple user-friendly
syntax that compiles into the terse form. With decent management
software the underlying terse form should be transparent to the
administrator -- in fact, if we can assume that everyone is using a
compiler it might be a good idea to make the terse form even more
compact. As you say, you can always escape to some kind of proxy that
does regexp stuff.

That message didn't really explain the language, mainly because the
terse form is explained in more detail with examples in the proposal
document. But conceptually it is pretty simple; each ``statement'' is
executed in sequence, and each chops 0 or more chars off the front of
the string and then produces a string to catenate into a domain name.
The following specify what to chop off:

  * eat-including: chops right after the first character that appears
    in the specified delimiter set.

  * eat-until: chops off right before the first char that appears in
    the specified delimiter set.

  * eat-x-chars: eats a specified number of chars.

  * match-prefix: fails unless the specified string is a prefix of
    the string, and if it matches eats the prefix. (We may want a
    case-insensitive version of this as well..)

  * match-rest: fails unless the rest of the string is an exact
    match, if it matches eats the rest of the string.

Once a prefix is chopped off, it can either be replaced with a
specified string (&replace-with) or copied verbatim (&copy). The
result strings are catenated in order and then the result is reversed
to produce the domain name to look up.

For example, continuing this lisp metaphor:

The input is eaten in the following parts:

    mail: edu.mit.lcs @ girod
         ^           ^ ^

--> (dns-reverse (catenate "" "edu.mit.lcs" ".mail-urn"))
--> (dns-reverse "edu.mit.lcs.mail-urn")
--> "mail-urn.lcs.mit.edu"

Hope this clarifies things, and thanks a lot for the comments,

- Lewis
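
The chop-and-catenate procedure described in the message can be
sketched in Python. This is only an illustration of the five chopping
operations and the two actions as the message describes them, not the
proposal's implementation and not its terse DNS encoding. The function
names, the RewriteFailure exception, the MAIL_RULES list, and the use
of None to stand for &copy are all hypothetical. Two behaviours are
assumptions because the message does not specify them: a missing
delimiter and a too-short input are both treated as failures. The
sketch also does not strip a leading ``urn:''.

```python
from typing import Callable, List, Optional, Tuple


class RewriteFailure(Exception):
    """A statement could not be applied to the remaining input."""


# A chopper splits the remaining input into (eaten prefix, rest).
Chopper = Callable[[str], Tuple[str, str]]
# A statement pairs a chopper with a replacement string.
# None stands for "&copy" (copy the eaten prefix verbatim).
Statement = Tuple[Chopper, Optional[str]]


def _split_at_delim(s: str, delims: str, include: bool) -> Tuple[str, str]:
    for i, ch in enumerate(s):
        if ch in delims:
            cut = i + 1 if include else i
            return s[:cut], s[cut:]
    # Assumption: a missing delimiter fails; the message does not say.
    raise RewriteFailure(f"no character from {delims!r} in {s!r}")


def eat_including(delims: str) -> Chopper:
    """Chop right after the first character found in the delimiter set."""
    return lambda s: _split_at_delim(s, delims, include=True)


def eat_until(delims: str) -> Chopper:
    """Chop right before the first character found in the delimiter set."""
    return lambda s: _split_at_delim(s, delims, include=False)


def eat_x_chars(n: int) -> Chopper:
    """Chop a fixed number of characters."""
    def chop(s: str) -> Tuple[str, str]:
        if len(s) < n:  # assumption: too-short input fails
            raise RewriteFailure(f"need {n} chars, have {len(s)}")
        return s[:n], s[n:]
    return chop


def match_prefix(prefix: str) -> Chopper:
    """Fail unless the input starts with prefix; otherwise eat it."""
    def chop(s: str) -> Tuple[str, str]:
        if not s.startswith(prefix):
            raise RewriteFailure(f"{s!r} does not start with {prefix!r}")
        return prefix, s[len(prefix):]
    return chop


def match_rest(expected: str) -> Chopper:
    """Fail unless the whole remaining input equals expected; eat it all."""
    def chop(s: str) -> Tuple[str, str]:
        if s != expected:
            raise RewriteFailure(f"{s!r} is not {expected!r}")
        return s, ""
    return chop


def dns_reverse(name: str) -> str:
    """Reverse the order of the dot-separated labels."""
    return ".".join(reversed(name.split(".")))


def rewrite(name: str, statements: List[Statement]) -> Tuple[str, str]:
    """Run the statements in sequence; return (domain name, leftover input)."""
    parts: List[str] = []
    rest = name
    for chop, replacement in statements:
        eaten, rest = chop(rest)
        parts.append(eaten if replacement is None else replacement)
    return dns_reverse("".join(parts)), rest


# The three statements from the mail example in the message.
MAIL_RULES: List[Statement] = [
    (eat_including(":"), ""),           # &replace-with ""
    (eat_until("@"), None),             # &copy
    (eat_including("@"), ".mail-urn"),  # &replace-with ".mail-urn"
]

if __name__ == "__main__":
    print(rewrite("mail:edu.mit.lcs@girod", MAIL_RULES))
    # prints ('mail-urn.lcs.mit.edu', 'girod')
```

Representing each statement as a (chopper, replacement) pair keeps the
"what to chop" and "what to emit" halves separate, which mirrors the
way the message describes them.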